
    The Generalized Asymptotic Equipartition Property: Necessary and Sufficient Conditions

    Suppose a string $X_1^n=(X_1,X_2,\ldots,X_n)$ generated by a memoryless source $(X_n)_{n\geq 1}$ with distribution $P$ is to be compressed with distortion no greater than $D\geq 0$, using a memoryless random codebook with distribution $Q$. The compression performance is determined by the "generalized asymptotic equipartition property" (AEP), which states that the probability of finding a $D$-close match between $X_1^n$ and any given codeword $Y_1^n$ is approximately $2^{-nR(P,Q,D)}$, where the rate function $R(P,Q,D)$ can be expressed as an infimum of relative entropies. The main purpose here is to remove various restrictive assumptions on the validity of this result that have appeared in the recent literature. Necessary and sufficient conditions for the generalized AEP are provided in the general setting of abstract alphabets and unbounded distortion measures. All possible distortion levels $D\geq 0$ are considered; the source $(X_n)_{n\geq 1}$ can be stationary and ergodic; and the codebook distribution can have memory. Moreover, the behavior of the matching probability is precisely characterized, even when the generalized AEP is not valid. Natural characterizations of the rate function $R(P,Q,D)$ are established under equally general conditions. Comment: 19 pages
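For orientation, the statement in the abstract can be written out in one common form for the memoryless case. This is a sketch under assumed notation that the abstract does not define: $\rho$ is a single-letter distortion measure, $B(x_1^n, D)$ is the set of codewords within average distortion $D$ of $x_1^n$, $W$ is a joint distribution on source-reproduction pairs with first marginal $W_X$, and $H(\cdot\,\|\,\cdot)$ is relative entropy in bits.

```latex
% Assumed notation: distortion ball around the source string x_1^n.
B(x_1^n, D) = \Bigl\{ y_1^n : \tfrac{1}{n}\sum_{i=1}^{n} \rho(x_i, y_i) \le D \Bigr\}

% Generalized AEP (when it holds): the matching probability decays
% exponentially at rate R(P,Q,D).
-\frac{1}{n}\log_2 Q^n\bigl(B(X_1^n, D)\bigr) \longrightarrow R(P,Q,D)
\quad \text{almost surely as } n \to \infty

% Rate function as an infimum of relative entropies.
R(P,Q,D) = \inf_{W:\; W_X = P,\; E_W[\rho(X,Y)] \le D} H(W \,\|\, P \times Q)
```

The second line is the precise sense in which the matching probability is "approximately" $2^{-nR(P,Q,D)}$; the paper's contribution concerns exactly when this convergence does or does not hold.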

    Conservative Hypothesis Tests and Confidence Intervals using Importance Sampling

    Importance sampling is a common technique for Monte Carlo approximation, including Monte Carlo approximation of p-values. Here it is shown that a simple correction of the usual importance sampling p-values creates valid p-values, meaning that a hypothesis test created by rejecting the null when the p-value is <= alpha will also have a type I error rate <= alpha. This correction uses the importance weight of the original observation, which gives valuable diagnostic information under the null hypothesis. Using the corrected p-values can be crucial for multiple testing and also in problems where evaluating the accuracy of importance sampling approximations is difficult. Inverting the corrected p-values provides a useful way to create Monte Carlo confidence intervals that maintain the nominal significance level and use only a single Monte Carlo sample. Several applications are described, including accelerated multiple testing for a large neurophysiological dataset and exact conditional inference for a logistic regression model with nuisance parameters. Comment: 26 pages, 3 figures, 3 tables [significant rewrite of version 1, including additional examples, title change]
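A minimal sketch of what such a correction might look like is given below. The abstract does not state the formula, so the form used here is an assumption: the observation is treated as one extra draw, its importance weight is added to the weighted tail count, and the total is divided by n + 1. The function name, argument names, and the cap at 1 are all hypothetical choices made for illustration.

```python
import numpy as np

def corrected_is_pvalue(t_obs, w_obs, t_samples, w_samples):
    """Importance-sampling p-value that includes the observation's own weight.

    t_obs     : test statistic of the observed data
    w_obs     : importance weight (target / proposal) of the observed data
    t_samples : test statistics of n draws from the proposal distribution
    w_samples : importance weights of those n draws
    """
    t_samples = np.asarray(t_samples, dtype=float)
    w_samples = np.asarray(w_samples, dtype=float)
    n = t_samples.size
    # Weighted count of proposal draws at least as extreme as the observation.
    tail = np.sum(w_samples * (t_samples >= t_obs))
    # Assumed form of the correction: count the observation as one more draw.
    # The result is capped at 1 so it can be read as a probability.
    return min(1.0, (w_obs + tail) / (n + 1))
```

When every weight equals 1 (sampling directly from the null), this reduces to the familiar Monte Carlo p-value (1 + number of exceedances) / (n + 1).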

    Estimation of the Rate-Distortion Function

    Motivated by questions in lossy data compression and by theoretical considerations, we examine the problem of estimating the rate-distortion function of an unknown (not necessarily discrete-valued) source from empirical data. Our focus is the behavior of the so-called "plug-in" estimator, which is simply the rate-distortion function of the empirical distribution of the observed data. Sufficient conditions are given for its consistency, and examples are provided to demonstrate that in certain cases it fails to converge to the true rate-distortion function. The analysis of its performance is complicated by the fact that the rate-distortion function is not continuous in the source distribution; the underlying mathematical problem is closely related to the classical problem of establishing the consistency of maximum likelihood estimators. General consistency results are given for the plug-in estimator applied to a broad class of sources, including all stationary and ergodic ones. A more general class of estimation problems is also considered, arising in the context of lossy data compression when the allowed class of coding distributions is restricted; analogous results are developed for the plug-in estimator in that case. Finally, consistency theorems are formulated for modified (e.g., penalized) versions of the plug-in, and for estimating the optimal reproduction distribution. Comment: 18 pages, no figures [v2: removed an example with an error; corrected typos; a shortened version will appear in IEEE Trans. Inform. Theory]
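To make the plug-in idea concrete, the sketch below computes one point on the rate-distortion curve of the empirical distribution. It assumes a finite alphabet shared by source and reproduction (the paper's setting is more general), and it uses the standard Blahut-Arimoto iteration as the numerical method; the abstract does not say how the estimator is computed, so that choice, the function name, and the slope parameter `beta` are assumptions made for illustration.

```python
import numpy as np

def plug_in_rate_distortion(samples, alphabet, distortion, beta, iters=500):
    """One point (D, R) on the rate-distortion curve of the empirical distribution.

    samples    : observed source symbols (each must appear in `alphabet`)
    alphabet   : list of symbols, used as both source and reproduction alphabet
    distortion : matrix d[i][j], distortion between alphabet[i] and alphabet[j]
    beta       : slope parameter (> 0); larger beta gives smaller distortion
    Returns (D, R) with R in bits.
    """
    d = np.asarray(distortion, dtype=float)
    index = {a: i for i, a in enumerate(alphabet)}
    counts = np.bincount([index[s] for s in samples], minlength=len(alphabet))
    p = counts / counts.sum()                       # empirical source distribution
    q = np.full(len(alphabet), 1.0 / len(alphabet))  # initial reproduction law
    kernel = np.exp(-beta * d)

    def conditional(q):
        # Law of the reproduction symbol given each source symbol.
        cond = q * kernel
        return cond / cond.sum(axis=1, keepdims=True)

    for _ in range(iters):
        # Blahut-Arimoto update: new reproduction law is the output marginal.
        q = p @ conditional(q)

    cond = conditional(q)
    joint = p[:, None] * cond
    D = float(np.sum(joint * d))
    mask = joint > 0
    q_full = np.broadcast_to(q, cond.shape)
    R = float(np.sum(joint[mask] * np.log2(cond[mask] / q_full[mask])))
    return D, R
```

For example, `plug_in_rate_distortion([0]*70 + [1]*30, [0, 1], [[0, 1], [1, 0]], beta=2.0)` evaluates the plug-in estimate for a binary sample under Hamming distortion; sweeping `beta` traces out the estimated curve.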

    Exact Enumeration and Sampling of Matrices with Specified Margins

    We describe a dynamic programming algorithm for exact counting and exact uniform sampling of matrices with specified row and column sums. The algorithm runs in polynomial time when the column sums are bounded. Binary or non-negative integer matrices are handled. The method is distinguished by applicability to non-regular margins, tractability on large matrices, and the capacity for exact sampling.
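The counting problem can be illustrated with a small memoized recursion for binary matrices. This is not the algorithm of the paper, whose details the abstract does not give; it is a simple sketch that fills one row at a time and caches on the multiset of remaining column sums, and it can take exponential time on large inputs. The function name is hypothetical.

```python
from functools import lru_cache
from itertools import combinations

def count_binary_matrices(row_sums, col_sums):
    """Count 0/1 matrices with the given row sums and column sums."""
    if sum(row_sums) != sum(col_sums):
        return 0
    rows = tuple(row_sums)

    @lru_cache(maxsize=None)
    def count(i, remaining):
        # `remaining` holds the column sums still to be filled. It is kept
        # sorted because the number of completions depends only on the
        # multiset of remaining sums, not on the column order.
        if i == len(rows):
            return 1 if all(c == 0 for c in remaining) else 0
        open_cols = [j for j, c in enumerate(remaining) if c > 0]
        total = 0
        # Choose which columns receive a 1 in row i.
        for chosen in combinations(open_cols, rows[i]):
            nxt = list(remaining)
            for j in chosen:
                nxt[j] -= 1
            total += count(i + 1, tuple(sorted(nxt)))
        return total

    return count(0, tuple(sorted(col_sums)))
```

With all margins equal to 1 the function counts permutation matrices, so `count_binary_matrices([1, 1, 1], [1, 1, 1])` returns 6.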

    Backing the horse or the jockey? Due diligence, agency costs, information and the evaluation of risk by business angel investors

    This paper explores the argument that business angel investors are more concerned with managing and minimising agency risk than market risk. Based on data on the due diligence process from a survey of business angels in the UK, the paper concludes that business angels do view entrepreneur characteristics and experience as having the greatest impact on the perceived riskiness of an investment opportunity. Further, they emphasise personal and informal over formal sources of information in the due diligence process, and seek information on both the entrepreneur and the venture in determining valuation. Indeed, the reliance of business angels on short-term and subjective information to value investment opportunities leads to the conclusion that their approach to valuation is not a function of the conventional protocols of financial analysis, but of personal relations and assessment.

    Different cation-protonation patterns in molecular salts of unsymmetrical dimethyhydrazine: C2H9N2·Br and C2H9N2·H2PO3

    Acknowledgements: We thank the EPSRC National Crystallography Service (University of Southampton) for the data collections. Peer reviewed. Publisher PDF